Efficient algorithms for the discovery of DNA oligonucleotide barcodes from sequence databases

Internal identifier: 002891 (Main/Exploration); previous: 002890; next: 002892

Authors: M. Zahariev; V. Dahl; W. Chen; C. A. Lévesque

Source:

Molecular Ecology Resources [ 1755-098X ] ; 2009-05.

RBID : ISTEX:85AC8DD5A083C482925EB77B2415A01CB3C9FA0C

English descriptors

Teeft:
- Algorithm, Ambiguous nucleotides, Ambiguous oligonucleotide, Ambiguous subsequence, Annual review, Array designer, Average case, Barcode, Barcode oligonucleotides, Barcodes, Blackwell publishing, Blot hybridization, Brute force, Brute force approach, Clade, Cock awam, Computational, Computational complexity, Consistent complexity, Current subsequence, Database, Enumerate, Environmental microbiology, Equivalence relation, Experimental temperature, First subsequence, Folder, Functional genomics, Group counter, Group oligonucleotide barcode, Group oligonucleotide barcodes, Hash tables, High specificity, Hybridization, Large data sets, Large databases, Large pool, Linear comparison, Middle position, Molecular ecology resources, Node, Nucleic acids research, Nucleotide, Oligonucleotide, Oligonucleotide array, Oligonucleotide barcode candidate, Oligonucleotide barcodes, Oligonucleotide length, Oligonucleotide locations, Oligonucleotides, Other sequences, Penicillium, Penicillium subgenus penicillium, Pythium species, Second subsequence, Second subsequence leaf, Seifert, Siam journal, Signature oligonucleotides, Sigoli, Single nucleotide polymorphism, Software, Specific oligonucleotides, Subsequence, Such subsequences, Suffix arrays, Suffix trees, Target sequences, Thermodynamic properties, Thomma bphj, Total order relation, Unambiguous nucleotides, Unambiguous subsequence, Unambiguous subsequences, Unique oligonucleotides, Unmarked elements, Whole tree, Worst case.

Abstract

Efficient design of barcode oligonucleotides can lead to significant cost reductions in the manufacturing of DNA arrays. Previous methods are based on either a preliminary alignment, which reduces their efficiency for intron‐rich regions, or on a brute force approach, not feasible for large‐scale problems or on data structures with very poor performance in the worst case. One of the algorithms we propose uses ‘oligonucleotide sorting’ for the discovery of oligonucleotide barcodes of given sizes, with good asymptotic performance. Specific barcode oligonucleotides with at least one base difference from other sequences in a database are found for each individual sequence. With another algorithm, specific oligonucleotides can also be found for groups or clades in the database, which have 100% homology for all oligonucleotide sequences within the group or clade while having differences with the rest of the data. By re‐organizing the sequences/groups in the database, oligonucleotides for different hierarchical levels can be found. The oligonucleotides or polymorphism locations identified as species or clade specific by the new algorithm are refined and screened further for hybridization thermodynamic properties with third party software.
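The sorting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, whose details are not given in this record. It assumes unambiguous nucleotides only and a fixed oligonucleotide length `k`, and the function names `kmers_with_owners`, `specific_barcodes` and `group_barcodes` are hypothetical. Every length-`k` subsequence is enumerated with the identifier of its source sequence, the list is sorted so that identical subsequences become adjacent, and a single linear pass then reports which sequences share each subsequence.

```python
from itertools import groupby


def kmers_with_owners(sequences, k):
    """Yield (k-mer, set of ids of the sequences that contain it)."""
    # Enumerate every length-k subsequence with the id of its source sequence.
    entries = [
        (seq[i:i + k], seq_id)
        for seq_id, seq in sequences.items()
        for i in range(len(seq) - k + 1)
    ]
    # Sorting makes identical k-mers adjacent, so one linear pass
    # over the sorted list groups them without pairwise comparisons.
    entries.sort()
    for kmer, run in groupby(entries, key=lambda entry: entry[0]):
        yield kmer, {seq_id for _, seq_id in run}


def specific_barcodes(sequences, k):
    """Map each sequence id to the k-mers found in that sequence only."""
    result = {seq_id: [] for seq_id in sequences}
    for kmer, owners in kmers_with_owners(sequences, k):
        if len(owners) == 1:
            result[next(iter(owners))].append(kmer)
    return result


def group_barcodes(sequences, groups, k):
    """Map each group name to the k-mers present in every member of the
    group and in no sequence outside it (assumed reading of the abstract)."""
    result = {name: [] for name in groups}
    for kmer, owners in kmers_with_owners(sequences, k):
        for name, members in groups.items():
            if owners == set(members):
                result[name].append(kmer)
    return result


# Toy data, invented for illustration only.
toy = {"s1": "ACGTAC", "s2": "ACGTTT", "s3": "GGGTAC"}
print(specific_barcodes(toy, 3))
# {'s1': [], 's2': ['GTT', 'TTT'], 's3': ['GGG', 'GGT']}
print(group_barcodes(toy, {"A": {"s1", "s2"}}, 3))
# {'A': ['ACG', 'CGT']}
```

In this sketch the sort dominates the cost: for N subsequences in total it needs on the order of N log N comparisons of length-`k` strings, whereas comparing every subsequence against every other would need on the order of N² comparisons. Candidates found this way would still need the thermodynamic screening that the abstract delegates to third-party software.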

Url:

https://api.istex.fr/ark:/67375/WNG-FN9C5627-H/fulltext.pdf

DOI: 10.1111/j.1755-0998.2009.02651.x

Affiliations:

Links toward previous steps (curation, corpus...)

to stream Istex, to step Corpus: 001350
to stream Istex, to step Curation: 001350
to stream Istex, to step Checkpoint: 000688
to stream Main, to step Merge: 002917
to stream Main, to step Curation: 002891

The document in XML format

<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Efficient algorithms for the discovery of DNA oligonucleotide barcodes from sequence databases</title>
<author><name sortKey="Zahariev, M" sort="Zahariev, M" uniqKey="Zahariev M" first="M." last="Zahariev">M. Zahariev</name>
</author>
<author><name sortKey="Dahl, V" sort="Dahl, V" uniqKey="Dahl V" first="V." last="Dahl">V. Dahl</name>
</author>
<author><name sortKey="Chen, W" sort="Chen, W" uniqKey="Chen W" first="W." last="Chen">W. Chen</name>
</author>
<author><name sortKey="Levesque, C A" sort="Levesque, C A" uniqKey="Levesque C" first="C. A." last="Lévesque">C. A. Lévesque</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:85AC8DD5A083C482925EB77B2415A01CB3C9FA0C</idno>
<date when="2009" year="2009">2009</date>
<idno type="doi">10.1111/j.1755-0998.2009.02651.x</idno>
<idno type="url">https://api.istex.fr/ark:/67375/WNG-FN9C5627-H/fulltext.pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001350</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Corpus" wicri:corpus="ISTEX">001350</idno>
<idno type="wicri:Area/Istex/Curation">001350</idno>
<idno type="wicri:Area/Istex/Checkpoint">000688</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Checkpoint">000688</idno>
<idno type="wicri:doubleKey">1755-098X:2009:Zahariev M:efficient:algorithms:for</idno>
<idno type="wicri:Area/Main/Merge">002917</idno>
<idno type="wicri:Area/Main/Curation">002891</idno>
<idno type="wicri:Area/Main/Exploration">002891</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main">Efficient algorithms for the discovery of DNA oligonucleotide barcodes from sequence databases</title>
<author><name sortKey="Zahariev, M" sort="Zahariev, M" uniqKey="Zahariev M" first="M." last="Zahariev">M. Zahariev</name>
<affiliation><wicri:noCountry code="subField"></wicri:noCountry>
</affiliation>
</author>
<author><name sortKey="Dahl, V" sort="Dahl, V" uniqKey="Dahl V" first="V." last="Dahl">V. Dahl</name>
<affiliation><wicri:noCountry code="subField"></wicri:noCountry>
</affiliation>
</author>
<author><name sortKey="Chen, W" sort="Chen, W" uniqKey="Chen W" first="W." last="Chen">W. Chen</name>
<affiliation></affiliation>
<affiliation><wicri:noCountry code="subField">5B6</wicri:noCountry>
</affiliation>
</author>
<author><name sortKey="Levesque, C A" sort="Levesque, C A" uniqKey="Levesque C" first="C. A." last="Lévesque">C. A. Lévesque</name>
<affiliation></affiliation>
<affiliation><wicri:noCountry code="subField">5B6</wicri:noCountry>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="j" type="main">Molecular Ecology Resources</title>
<title level="j" type="sub">Special Issue on Barcoding Life</title>
<title level="j" type="alt">MOLECULAR ECOLOGY RESOURCES</title>
<idno type="ISSN">1755-098X</idno>
<idno type="eISSN">1755-0998</idno>
<imprint><biblScope unit="vol">9</biblScope>
<biblScope unit="issue">s1</biblScope>
<biblScope unit="page" from="58">58</biblScope>
<biblScope unit="page" to="64">64</biblScope>
<biblScope unit="page-count">7</biblScope>
<publisher>Blackwell Publishing Ltd</publisher>
<pubPlace>Oxford, UK</pubPlace>
<date type="published" when="2009-05">2009-05</date>
</imprint>
<idno type="ISSN">1755-098X</idno>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">1755-098X</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="Teeft" xml:lang="en"><term>Algorithm</term>
<term>Ambiguous nucleotides</term>
<term>Ambiguous oligonucleotide</term>
<term>Ambiguous subsequence</term>
<term>Annual review</term>
<term>Array designer</term>
<term>Average case</term>
<term>Barcode</term>
<term>Barcode oligonucleotides</term>
<term>Barcodes</term>
<term>Blackwell publishing</term>
<term>Blot hybridization</term>
<term>Brute force</term>
<term>Brute force approach</term>
<term>Clade</term>
<term>Cock awam</term>
<term>Computational</term>
<term>Computational complexity</term>
<term>Consistent complexity</term>
<term>Current subsequence</term>
<term>Database</term>
<term>Enumerate</term>
<term>Environmental microbiology</term>
<term>Equivalence relation</term>
<term>Experimental temperature</term>
<term>First subsequence</term>
<term>Folder</term>
<term>Functional genomics</term>
<term>Group counter</term>
<term>Group oligonucleotide barcode</term>
<term>Group oligonucleotide barcodes</term>
<term>Hash tables</term>
<term>High specificity</term>
<term>Hybridization</term>
<term>Large data sets</term>
<term>Large databases</term>
<term>Large pool</term>
<term>Linear comparison</term>
<term>Middle position</term>
<term>Molecular ecology resources</term>
<term>Node</term>
<term>Nucleic acids research</term>
<term>Nucleotide</term>
<term>Oligonucleotide</term>
<term>Oligonucleotide array</term>
<term>Oligonucleotide barcode candidate</term>
<term>Oligonucleotide barcodes</term>
<term>Oligonucleotide length</term>
<term>Oligonucleotide locations</term>
<term>Oligonucleotides</term>
<term>Other sequences</term>
<term>Penicillium</term>
<term>Penicillium subgenus penicillium</term>
<term>Pythium species</term>
<term>Second subsequence</term>
<term>Second subsequence leaf</term>
<term>Seifert</term>
<term>Siam journal</term>
<term>Signature oligonucleotides</term>
<term>Sigoli</term>
<term>Single nucleotide polymorphism</term>
<term>Software</term>
<term>Specific oligonucleotides</term>
<term>Subsequence</term>
<term>Such subsequences</term>
<term>Suffix arrays</term>
<term>Suffix trees</term>
<term>Target sequences</term>
<term>Thermodynamic properties</term>
<term>Thomma bphj</term>
<term>Total order relation</term>
<term>Unambiguous nucleotides</term>
<term>Unambiguous subsequence</term>
<term>Unambiguous subsequences</term>
<term>Unique oligonucleotides</term>
<term>Unmarked elements</term>
<term>Whole tree</term>
<term>Worst case</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Efficient design of barcode oligonucleotides can lead to significant cost reductions in the manufacturing of DNA arrays. Previous methods are based on either a preliminary alignment, which reduces their efficiency for intron‐rich regions, or on a brute force approach, not feasible for large‐scale problems or on data structures with very poor performance in the worst case. One of the algorithms we propose uses ‘oligonucleotide sorting’ for the discovery of oligonucleotide barcodes of given sizes, with good asymptotic performance. Specific barcode oligonucleotides with at least one base difference from other sequences in a database are found for each individual sequence. With another algorithm, specific oligonucleotides can also be found for groups or clades in the database, which have 100% homology for all oligonucleotide sequences within the group or clade while having differences with the rest of the data. By re‐organizing the sequences/groups in the database, oligonucleotides for different hierarchical levels can be found. The oligonucleotides or polymorphism locations identified as species or clade specific by the new algorithm are refined and screened further for hybridization thermodynamic properties with third party software.</div>
</front>
</TEI>
<affiliations><list></list>
<tree><noCountry><name sortKey="Chen, W" sort="Chen, W" uniqKey="Chen W" first="W." last="Chen">W. Chen</name>
<name sortKey="Dahl, V" sort="Dahl, V" uniqKey="Dahl V" first="V." last="Dahl">V. Dahl</name>
<name sortKey="Levesque, C A" sort="Levesque, C A" uniqKey="Levesque C" first="C. A." last="Lévesque">C. A. Lévesque</name>
<name sortKey="Zahariev, M" sort="Zahariev, M" uniqKey="Zahariev M" first="M." last="Zahariev">M. Zahariev</name>
</noCountry>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Main/Exploration

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002891 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 002891 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:85AC8DD5A083C482925EB77B2415A01CB3C9FA0C
   |texte=   Efficient algorithms for the discovery of DNA oligonucleotide barcodes from sequence databases
}}

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021

MERS exploration server
Warning: this site is under development! It was generated by automated means from raw corpora, so the information has not been validated.
